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ABSTRACT 


This thesis proposes a new analysis/synthesis procedure for speech and image 
compression. The algorithm applies the discrete wavelet transform to the subject data in 
order to obtain a set of multiresolution wavelet coefficients. The wavelet coefficients are 
then encoded by using a multiresolution codebook designed using the generalized Lloyd 
algorithm. The statistical properties of the wavelet coefficients are utilized to determine 
the number of resolution levels as well as the codebook size at each resolution level. 
Coding results show that the new procedure provides a significant improvement in the 


quality of the reproduced data. The data tested includes speech, image, and transient 
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', INTRODUCTION 


The recent emergence of wavelet transforms has encouraged many researchers to 
seek new application areas for them [Ref. 10-12}. Since discrete wavelet transforms 
(DWT) are similar to subband coding systems and subband coders have been used in 
speech and image compression, wavelets have found immediate application in the 
waveform coding area [Ref. 10-12]. Data compression consists of converting a data 
sequence into a stream of relatively low bit rate data for transmission over a digital 
communication channel or storage in a digital media. The DWT itself does not reduce 
the total bit rate. but it provides a set of wavelet coefficients using an algorithm with 
pyramidal architecture. Since wavelet coefficients at different resolution levels exhibit 
different statistical characteristics, by designing a quantizer for each set, we can reduce 
the bit rate without significant degradation in quality. 

In this thesis, we present a coding algorithm that combines the DWT and vector 
quantization (VQ). VQ is a method of quantizing a set of data samples jointly as a 
vector. The overall procedure is known as an analysis-synthesis method and is extensively 
used in transform and subband coding areas. It involves two steps. First, we use the 
discrete wavelet transform to obtain a set of wavelet coefficients; second, we vector 
quantize these coefficients by using codebooks designed for each set of coefficients. The 
application of wavelet transforms to this technique is relatively new. The DWT has been 


applied to image coding by combining scalar quantization and vector quantization [Ref. 











10]. Another algorithm has been reported that uses lattice vector quantization [Ref. 11]. 


The algonthm we present here uses generalized Lloyd algorithm [Ref. 8] for vector 
quantization of wavelet coefficients. We apply this coding technique both to speech and 
to image data. 

The remainder of this thesis is organized as follows. In Chapter H, we introduce 
the basic principles of wavelet theory. This chapter also covers the algorithms for the 
discrete wavelet transforms in both one and two dimensions. Chapter IT describes the 
basic concepts of vector quantization and presents the vector quantization algorithm used 
in this thesis. In Chapters IV and V, we combine these algorithms to develop coding 


methods for speech and image compression. Chapter VI presents conclusions. 











fl. WAVELET TRANSFORMS 


A. INTRODUCTION 

Currently wavelet transforms are a popular topic of research in various signal 
processing applications. They are viewed as a new way to represent signals as well as 
a new technique for time-frequency analysis. From the multiresolution point of view, 
they provide powerful tools for speech and image coding applications. In this chapter we 
briefly discuss the basic wavelet theory and discrete wavelet transforms with orthogonal 


wavelets. We also explore the biorthogonal case in the last section. 


B. WAVELET THEORY 

Wavelet transforms can be considered an alternative to classical short-time Fourier 
transforms (STFT). The basic difference is that wavelet transforms use variable window 
sizes that change along the frequency range. The STFT analyzes signals by using a single 
window size [Ref. 1}; consequently, the time-frequency resolution is fixed over the entire 
time frequency plane (see Fig. 2.1(a)). Wavelet transforms provide an analysis method 
known as constant-Q or constant-relative bandwidth analysis. In this method we use a 
family of analysis filters where the time resolution increases with the center frequency 
of the filters. As shown in Fig. 2.1(6), wavelet analysis provides better frequency 
resolution at low frequencies and better time resolution at high frequencies. In other 


words, the time-frequency window is flexible such that it automatically narrows at a high 


frequency frequency 


(a) 
Figure 2.1 Coverage of Time-Frequency Plane for (a) STFT and (b) WT. 


center frequency and widens at a low center frequency. However, the area under this 
window is independent of the center frequency. 

In order to perform wavelet transforms, a set of basis functions called wavelets are 
required. These basis functions are obtained from a prototype function, y(t), called the 
mother wavelet, by means of dilations and translations. The basis functions then take the 


form: 


1 t-b 
¥,.0-——¥(—) , (2.1) 
"flay a 


where a and b are the dilation and translation parameters, respectively. A basis function 
becomes a dilated (low frequency) version of the prototype for a large value of a while 


it is a contracted (high frequency) version for a small value of a. The dilation and 





contraction of the mother wavelet in the time domain cause a corresponding contraction 
and dilation in the frequency domain. 
The continuous wavelet transform (CWT) of x(t) is defined as the inner product of 


the form [Ref. 13]: 


W, (a,b) = | x(t), ,(t)dt (2.2) 
where the asterisk indicates complex conjugation. Since the y,, are orthogonal, the 


original signal can be reconstructed by using the inverse relationship: 


x(t) =¢ f f W,(a, bw, ,(t)dadb, (2.3) 
where c is a constant of proportionality. 

So far we have mentioned only the wavelet function, Y(t). For multiresolution 
analysis, we also need another basis function called the scaling function, $(t). The 
scaling function is used to transform the signal from a fine to a coarse scale [Ref. 1]; 
moreover, its operation can be viewed as lowpass filtering while that of wavelet can be 
considered as highpass filtering. The scaling function has the form: 

a 
via] 


Equation 2.1 and 2.4 are similar in form. 


$,,(0 = oF). (2.4) 


1. _ Discretization of Dilation and Translation Parameters 
The continuous wavelet transform has two drawbacks, namely, redundancy 


and impracticality. Both problems can be solved by sampling the dilation and translation 











parameters. This procedure leads to a family of wavelets and scaling functions with 
discrete parameters. 
Let a=a,." and b=kb,p,”, then y(t) and $(t) defined in Eq. 2.1 and Eq. 2.4 


become: 


-m2 


W,y(2) =o” W(ay "t-kb,), (2.5) 


,(t) =a" (ag "t-kb,), (2.6) 
where m, k are integers and a, > 1, b, # O. Inserting Eq. 2.5 into Eq. 2.2 yields 


wavelet coefficients in the discrete form [Ref. 5]: 


W fm,k) =a)” [x(t (ap "t-kb,)at. (2.7) 
The wavelet family is to be chosen so that the original signal can be uniquely 
reconstructed [Ref. 1]. In particular, they must form an orthonormal basis in L7(R), i. 


e., they must satisfy: 


sped pegt 
f Yl OW yfO=[ eatin 2.8) 


0, otherwise. 


For the orthonormality condition, it can be shown that the parameter a, and b, must be 


chosen as a, = 2 and b, = I which imply the dyadic scale [Ref. 1]. 


2. Miultiresolution Analysis 
The idea of multiresolution analysis was introduced by Mallat [Ref. 3]. In his 


study, he used this approach to construct orthonormal wavelets. Assume that the original 


signal x(t) is measurable and has finite energy. We successively decompose x(t) into 








approximation and wavelet coefficients. This decomposition is done by convolving x(t) 


with the scaling and wavelet filters. 
Multiresolution analysis consists of a set of closed subspaces 
{V, | O=m=M} € L’(R), where we express the resolution level of the original signal 
at m=O and the lowest resolution level at m=M [Ref. 5]. The subspaces share certain 
properties: 
® Causality : The vector space V,,,, of resolution level m+J is a subset of the vector 
space V,, of the upper resolution level (m). 


< coarser... C V,,,; C V, C V,,.,; © ... finer > 


®@ Completeness: The vector spaces must satisfy the following operations: 
UV, = L?(R) AN V,, = {0}. 


@ Scaling: The spaces of approximated signals can be computed by scaling each 
approximated signal by the ratio of their resolution values, i.e., if x(t) € V,,, then 
x(2t) © Vi. 

© Orthogonality: V,, = V, ® W,,, where W,, is the orthogonal complement of V,, 
in V,,,; , t.e., V,, L W,,. In other words, V,,, is equivalent to V,, plus some added 


detail corresponding to W,,. Furthermore, this property and the causality property 
together imply that the direct sum of all W,, spans L7(R): 


V_=W, OW, OW, ®... (2.9) 


m+ 


We associate the scaling function $(t) with V,,, and the wavelet function y(t) with 


W,,. We can write (f) as: 


(t) =2)> h(n) o(2t-n), (2.10) 


and similarly we can express y(t) as: 











¥(0) =2>)> g(n) p(2r-n), (2.11) 


where the coefficient set h(n) is the unit impulse response of a lowpass filter while g(n) 
is the unit impulse response of a highpass filter. The filter g(n) is derived from h(n) by: 
g(n) =(-1)"A(1-n). (2.12) 


The filters, h(n) and g(n), must satisfy certain conditions for a perfect reconstruction, 


[Ref. 14]; namely: 


y h(n)=1 (2.13) 
x m=2 (2.14) 
2)> A(n-2m)h(n-2k)=5(m-k). (2.15) 


The wavelet filter, g(n), must also satisfy Eq. 2.14 and 2.15, but in place of Eq. 2.13, 


g(n) must satisfy: 


¥ a(n)=0. (2.16) 


From the orthogonality condition, h(n) and g(n) together satisfy the following equations: 


2 > [h(m -2n)h(k-2n) +g(m -2n)g(k-2n)] =5(m-k) (2.17) 














> h(n-2k)g(n-2m)=0. (2.18) 


Equation 2.10 through 2.18 describe the essential properties of these functions. 


C. DISCRETE WAVELET TRANSFORMS 

The discrete wavelet transform (DWT) was developed for processing discrete-time 
signals (Ref. 3, 4]. The DWT consists of a series of identical operation blocks with both 
time and dilation/translation parameters in discrete form. In this section, we start with 


the DWT for the 1-D case and then extend it to the 2-D case. 


1. One-Dimensional DWT 
The one-half resolution approximation of a discrete signal x(n) can be 

obtained by convolving it with a lowpass filter followed by subsampling by two. This 
forms the basic procedure for decomposing the original sequence into lower resolution 
levels. At the same time, convolving the sequence at each level with a highpass filter 
gives us the wavelet coefficients. Figure 2.2 depicts the pyramidal decomposition scheme 
of the DWT while Figure 2.3 shows the operations at a given resolution level. Each 
block in Fig. 2.2 performs the following operations: 

@ Convolution with the argument filter, (nm) or g(n). 

@ Subsampling (decimation) by two. 

© Normalization by V2. 
The wavelet coefficients are the ones we need to save in order to reconstruct the original 


signal. We also need to save the approximation coefficients of the lowest level, M. In 











other words, the DWT representation of x(n) consists of the coefficient set: {x,,, d,,, 


m=1,..,M}. 


a. Decomposition 
Given a discrete time sequence x(n) € L?(R), the approximation 


coefficients are computed recursively from: 


Xm KM =V¥2 2 h(k-2n)x_(k); m=0,1,....M-1 (2.19) 


where x,,(n) is the signal at resolution level m; x,(n) is the original signal. This operation 
is equivalent to lowpass filtering of x,,(n) followed by subsampling by two. Consider an 
FIR structure for the filter, h(n). The filtering operation in Eq. 2.19 can be represented 


as a matrix vector product: 


X= V2Hx,, (2.20) 


where 


H- ~ A(L-1) A(L-2) - kh) O 9 ca (2.21) 
0 QO A(L-1) -- h(2) ACL) h(O) 


and x,, = [--x,,(-1), X,(O), X,(1), --J'. The matrix, H, must satisfy the condition (Ref. 4]: 


2HH T =J (2.22) 


10 











Figure 2.2 General Decomposition Algorithm of DWT 








X,{n) 





Figure 2.3 One-Step Decomposition Algorithm 
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where J is the identity matrix, and (.)' stands for the transposition operation. Notice the 
similarity between Eq. ? 22 and Eq. 2.14. Using the same algorithm with the wavelet 


filter gives us wavelet coefficients of lower resolution level: 
d,,.(n)=V2)_ a(k-2n)x,(k), (2.23) 
k 


where the coefficient set g(n) is defined by Eq. 2.12. If we build a matrix, G, in a form 
analogous to Eq. 2.21, we have: 

2GG "=I, (2.24) 
and 

HG‘-0 (2.25) 
which follows from Eq. 2.18 


b. Reconstruction 
We know that the subspaces in which the signals are represented satisfy 
V,... = V, ® W,,. This implies that we can reconstruct each higher level by adding the 
projections from the two orthogonal subspaces of the current level. The projections of 


a sequence onto V,, and W,, are given by: 


= =2H7Hx, and d,,=2G"Gx.. (2.26) 


From Eq. 2.9 and 2.25 the projections are orthonormal and span complete subspaces. 


Thus, from Eq. 2.26 and the relation x,, 








2(H "H+G'G)=1 (2.27) 


The reconstruction algorithm is depicted in Fig. 2.4 and has the form: 
x,, )=/29 [h(n -2k)x,(k) + @(n-2k)d,(k)). (2.28) 
k 


This formula is derived as follows, substituting Eq. 2.19 and 2.23 into Eq. 2.28 and 
taking Eq.2.17 into consideration, we observe that the righthand side of Eq. 2.28 


becomes: 


V2>> [h(n -2K)x,, (4) +8(n-24)d, (4)] 
k 
=e 22) AGj-2k)x,,_ D1 +g(n-2k) V2 es g(j-2k)x,,_,())) 
2 > [A(n- kh -2k)+g(n-2k)g(j-24)]x,,_,0) 


ax, (0). 


(2.29) 
This completes the set of formulas needed for the one-dimensional DWT. 
2. Two-Dimensional DWT 
We now extend the one-dimensional DWT to the two-dimensional case. This 
can be done by using multidimensional extensions of wavelets. Assume that we have a 
set of subspaces V,,’ satisfying the multiresolution conditions described in Section B.2. 


Then, we can define a vector space V,, as the tensor product of these subspaces in L’(R’): 


V,,=V_ QV: (2.30) 
If (t,) is a one-dimensional scaling function of V,’, then we can derive the 


corresponding separable two-dimensional scaling function as: 


13 











X,() 





X, 
do) aa 


Figure 2.4 One-Step Reconstruction Algorithm. 


(t,,t,)=0(t,)0(t,). (2.31) 


Similarly, two-dimensional wavelets can be obtained as: 


¥ \(t, ty) =H(t,)H(t,) F) (2 .32) 
P(t, st.)= p(t, )O(t,) > (2 33) 
P(t,,t,)=w(t, w(t). (2.34) 


These functions are orthogonal to each other (follows from the orthogonality of one- 
dimensional components). Table 2.1 summarizes the two-dimensional basis functions and 


coefficients as well as their frequency characteristics. [Ref. 1, 3] 





Table 2.1 SUMMARY OF BASIS FUNCTIONS. 


Basis Coefficients Frequency 
Functions Characteristics 


Vertical Lowpass 
Honzontal Lowpass 


d',,(n, ny) Vertical Lowpass 
Horizontal Highpass 


d’,,(n, .N,) 






X(N; ,15) 














Vertical Highpass 
Horizontal Lowpass 


a(n, ,n,) Vertical Highpass 
Honizontal Highpass 


a. Decomposition 
We can obtain the approximation, x,,(,,n,), and wavelet coefficients, 
d,,(n,,ny), i=1,2,3, by convolving the approximation coefficients of higher resolution 


level by the horizontal and vertical scaling and wavelet filters: 


mes) =) » x, (k,Dh,(l-2n,)h,(k-2n,), (2.35) 
dy.) =2¥" Dy x,(kI)h,(1-2n,)g (k-2n,), (2.36) 
dy. M)=29 y x, (k,Dg,(1-2n,)h,(k-2n,), (2.37) 
dn. t)=2), yy x, (k,Dg,(l-2n,)g (k-2n,). (2.38) 


The procedure for 2-D decomposition is depicted in Fig. 2.5. Here, we have two levels 


of 1-D decomposition: one is convolution with a horizontal basis filter in the horizontal 


15 





direction (for example, rows of an image) and the second is convolution with a vertical 


basis filter in the vertical direction (for example, columns of an image). 


b. Reconstruction 
The reconstruction algorithm follows the same procedure of matrix 
multiplication in horizontal and vertical directions as in the decomposition scheme (see 


Fig. 2.6). In analogous to Eq. 2.25 and 2.27, we have: 


2(H,H/H,H,+G_H,H,G,+H,G, G,H,+G, G,G,G,}=1. (2.39) 


The reconstruction algorithm in Fig. 2.6 can be expressed in matrix notation as: 


x, =2H Hy x,,,,.+2G, Hd... +2H, G,d...+2G,'G,d,..,. (2.40) 


Substituting Eq. 2.35 through 2.38 into Eq. 2.40 and using the property of Eq. 2.39, we 


can show that: 


2H Hy x, .,+2G/ Hyd, ,+2H, G, de ,,+2G,G,d,., 


m-+l 
-2H7H,[H,H,x,]+2G, Hy 1H,G xq] (2.41) 
+2H'GJ1G,H,x,1+2G, G,1G,G,x,] 


=x mm 

The quality of reconstruction in both the 1-D ard 2-D cases depends on 
the precision of the filter coefficients. If we use the Haar filter (h=/0.5 0.5]), we do not 
introduce any reconstruction errors. For other filters, such as Daubechies filters, it is 
possible to have some errors in the reconstructed signal because these filter coefficients 


require infinite precision [Ref. 2]. In practice, however, the resulting error is still 


insignificant (around 10°'° for our signals). 
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HORIZONTAL VERTICAL 


Figure 2.5 Decomposition Algorithm of 2-D DWT. 





VERTICAL HORIZONTAL 


Figure 2.6 Reconstruction Algorithm of 2-D DWT. 
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D. BIORTHOGONAL WAVELETS 

We prefer to use short filters for fast computation and because fewer redundant 
samples are caused by the convolution. Short filters do not provide adequate smoothness 
however. For image analysis, linear phase filters are preferred [Ref. 4]. In order to have 
filters of arbitrary length with linear phase, we have to sacrifice orthogonality [Ref. 4]. 
The solution to this problem is to use filters based on a special class of functions called 
biorthogonal bases. 

Biorthogonality requires use of different filters for analysis and synthesis operations 
with the decomposition and reconstruction algorithms described in Section B of this 
chapter. Figure 2.7 shows a one step decomposition and reconstruction scheme using 
biorthogonal filters. The conditions for perfect reconstruction are satisfied by imposing 
orthogonality separately over the decomposition and reconstruction stages [Ref. 4]. Using 


the matrix notation we have used in Eq.2.21, we have: 


H,G,=G,H,=9, (2.42) 


2H,H,=2G,G,=1. (2.43) 


Given the analysis and synthesis scaling filters, h,(n) and h,(n), the corresponding 


wavelet filters are expressed as: 
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Figure 2.7 DWT with Biorthogonal Wavelets. 
8 (1) =(-1)"h, (1-7), (2.44) 


g,(n) =(-1)"hg(1-n). (2.45) 
Notice the similarity between Eq. 2.25 and Eq. 2.42, and Eq. 2.22 and Eq. 2.43 (in the 
orthogonal case H,=H,' and G,=G,’). Finally, perfect reconstruction in biorthogonal 
bases requires: 

2[H,H, +G,G,] =1. (2.46) 


The biorthogonal DWT algorithm for one-dimensional signals can be easily 
extended to the two-dimensional case by using biorthogonal filters in the vertical and 


horizontal directions. 
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IN. VECTOR QUANTIZATION 


A. INTRODUCTION 

Vector quantization (VQ) is generalization of scalar quantization where a scalar 
value is represented by one of several discrete levels. In vector quantization a vector 
quantity is represented by one of several other fixed vectors which are close to the 
original vector in some sense. According to Shannon’s rate distortion theory, better 
results are always obtained when vectors instead of scalars are encoded [Ref. 15]. As the 
vector dimension becomes large, VQ can approach the rate distortion limit for the 
problem. In typical applications, bit rates of lower than one bit per sample are achievable 


with vector quantization; these rates are unlikely for scalar quantization. 


B. QUANTIZER DESIGN 

Vector quantizers map vectors in a multidimensional space into a finite set of 
reproduction vectors called the codebook. The codebook is then used to encode and 
decode the data set. Figure 3.1 shows a schematic of a basic vector quantizer. The 
encoder quantizes blocks of the discrete time signal x(n) by using the codebook 
constructed previously and encodes it into a sequence of bits {b,}, which is then 
transmitted through the transmission channel or stored in some storage medium. If there 
are no channel errors, then {b,"}={b,}. At the receiver end, a decoder converts the 


sequence of bits {b,’} into a vector representing blocks of data and thus into the signal, 
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Figure 3.1 Basic Structure of Vector Quantization. 


s(n). The output s(n) is the reconstructed signal, which will be an approximation of the 
input signal x(n). 

Given the basic structure of VQ, we now proceed to discuss the details of VQ. 
Assume that x={x,, x, . . ., XyJ’ is an N-dimensional vector whose components 
{x,, 1<k<N} are real and continuous (amplitude) random variables. Specifically, the x, 
represent data samples in a block of data from the signal x(n). A vector quantizer maps 
x into another real-valued, discrete-amplitude, N-dimensional vector y. This operation 


can be described as: 


y=q(x), GB.) 


where q(.) is the quantization operato:. The reproduction vector y is chosen from a set 


of vectors M={y,, 1<i<L}, where y,=[y,,, yi, . ~~» Vivi’. The set M is called the 


codebook, where L is the size of the codebook. The quantity 








R=log,L (3.2) 


is the rate of the quantizer in bits per vector while r=R/N is the rate in bits per sample 
( or bits per pixel if the input is image data). To design a codebook, we partition the N- 
dimensional space of input vectors x into L cells {C,, 1<i<L} and associate a vector y, 
with each cell C;. Then we use the code vector y, to represent vectors that fall within cell 


C,, that is: 
q(x) =y;, if xEC,. (3.3) 


The vector quantizer is said to be a minimum-distortion quantizer if some measure 
of distortion is minimized over all code vectors. The measure of distortion is defined in 
terms of a distortion function, d(x,y), which represents the cost of representing any input 
vector x by a code vector y. Although there are many possible distortion measures (Refs. 


6, 8], we choose the mean-square error (MSE): 


N 
d(x,y) = Ea > (x, ~y,) (3.4) 
Nii 


due to its simplicity and mathematical tractability. This distortion measure is simply the 
squared Euclidean distance between x and y. 
To design an optimal (minimum-distortion) quantizer using this distortion function, 


there are two necessary conditions [Ref. 7]. First, the quantizer must satisfy the nearest 


neighbor condition: 














q(x) =y;, iff d(x,y,)sd(x,y,), j+i, 1sjsL. (3.5) 


That is, for each i, all input values that are closer to code vector y, than to any other 


code vector should be assigned to cell C,. Therefore: 


d(x,q(x)) oman): (3.6) 
YE 


Secondly, the code vectors y; must satisfy the centroid condition; each code vector y, is 


chosen such that y, minimizes the MSE in cell C,. In this case, it is easily shown that: 


y, =cent(C,). (3.7) 


K 
cent(C,) = 15 x, 
Kin 


for all of the vectors {x,, kK=1, 2, . . . , K} contained in cell C,. 

Based on the conditions defined above, an iterative procedure can be defined to 
optimize the quantizer starting from an arbitrary set of code vectors. The iteration takes 
place until convergence is obtained. The algorithm used is the generalized Lloyd (GL) 
algorithm [Ref. 8] also known as the LBG (Linde,Buzo,Gray) algorithm [Ref. 9]. The 
algorithm is detailed as follows : 

@ Step 1 (Initialization): Set m=1. Choose an initial codebook M,. 


© Step 2 (Classification): Given the codebook M,, classify the set of training vectors 
{x,, 1<k<L} into cells C, according to the nearest neighbor condition. 


@ Step 3 (Updating): m<m+1. Update the code vector y; of each cell C, by the 
centroid condition (see Eq. 3.7). 








@ Step 4 (Termination Test): Compute MSE for M,,,,. If it has changed by an 
amount less than a prespecified value, stop; otherwise, go to Step 2. 


A flow chart for this algorithm is shown in Fig. 3.2. 

The algorithm above requires an initial codebook for the first step. There is a 
variety of techniques available for generating the initial codebook (see Refs. 6 and 8 for 
different methods). In this work, we choose the random coding technique. Given a set 
of training vectors, this method randomly selects L code vectors. Once the initial code 
vectors are selected, the GL algorithm is performed to improve the codebook. If an 
initial code vector y corresponds to a cell where there are no training vectors, the empty 
cell problem arises. In this thesis we assume that a cell is empty if it has three or fewer 
training vectors. If an empty cell arises, we split the code vector with the highest number 
of training vectors into two code vectors for that particular cell and repartition the cell 
into two cells. The empty cell is then discarded. This algorithm forms Step 2 
(classification) of the GL iteration (see Fig. 3.3). The criterion to terminate the iteration 


in Step 4 (Figure 3.2) is based on the change in MSE, which is computed as: 


D_-D 
A(MSE) = ——™" . (3 9) 


where D,,=d(x,y) for iteration m. 
The quantization procedure described here is known as full search VQ (FSVQ) 
since all code vectors are tested for quantizing each input vector. Other methods of VQ 


exist (see Refs. 6 and 8), but were not explored in this thesis. 


24 





Initialization _ 
m=] 









| (Lloyd Iteration): 


oan 


Compute 


MSE 


am 


Termination 
Test 


Stop 








Figure 3.2 Flow Chart for Generalized Lloyd Algorithm. 
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Figure 3.3. Modification of GL Algorithm to Avoid the Empty Cell Problem. 





IV. ALGORITHM DEVELOPMENT FOR SPEECH DATA AND ANALYSIS 


A. INTRODUCTION 

The previous two chapters provide an overview of discrete wavelet transform and 
vector quantization. In this chapter we combine these concepts to develop algorithms for 
speech waveform compression. We also include a test case for transient signal 
compression in the last section. 

The method used here is known as analysis-synthesis coding. It has been commonly 
used in subband coding and other transform coding techniques [Ref. 8, 16]. This method 
analyzes the signal into components that in some sense provide a more primitive 
representation of the signal, and quantizes these components. The quantized components 
are then used to synthesize a reproduction of the original signal. Figure 4.1 illustrates 
the basic steps of this algorithm. Notice that Figure 4.1 is a modified version of Fig. 3.1, 
where we directly quantize the original signal without decomposing it into components. 

In this study, the DWT is used for analysis-synthesis operations while vector 
quantization is the method chosen for encoding the wavelet coefficients. The DWT 
provides a set of wavelet coefficients for each resolution level. Each set of coefficients 
has different statistical characteristics, and by designing a quantizer for each coefficient 
set optimally, the quantization error can be reduced. This error reduction at each level 


results in an improvement in the quality of reproduction of the original data. 
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Figure 4.1 Analysis-Synthesis Coding Scheme. 


In the following sections we develop algorithms for speech data based on the 
analysis-synthesis coding scheme. We also present the experimental results of the 


algorithm and comparisons of this method with direct vector quantization (DVQ). 


B. ALGORITHM DEVELOPMENT 

The analysis stage for speech data consists of a four-level DWT decomposition. 
Here, DWT decomposes the speech data into four sets of wavelet coefficients, {d.,, 
m=1,..., 4}, and one set of approximation coefficients, x, defined with Eq. 2.23 and 
Eq. 2.19, respectively. From this point on we call the combined set of coefficients {x,, 


and d,, m=1,..., 4} the wavelet coefficients for convenience. Figure 4.2 shows the 
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Frequency 


Figure 4.2 Division of Frequency Domain for DWT. 


division of the frequency spectrum for each resolution level. At each resolution level, the 
lower band is divided into low and high frequency bands and subsampled in the time 
domain by two as explained in Chapter II, Section C. The scaling filter, h(n), that we 
have chosen for the DWT is Daubechies 10-tap filter listed in Ref. 2. The wavelet filter, 
g(n), is derived from the scaling filter by Eq. 2.12. 

The next step after obtaining the wavelet coefficients is to construct the codebooks 
for each set. The sizes of the codebooks are determined by the number of bits allocated 
for each codebook. The bit allocation is a major concern in the design of a coding 
system. It consists of distributing a given quota of bits to various quantizers in order to 


optimize the overall coder performance [Ref. 8]. 
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1. Optimum Bit Allocation 
Fewer bits can be allocated to the high frequency components than to the low 
frequency components since the corresponding coefficients usually have a smaller 
variance. It is also possible to discard all coefficients at a given resolution level, m, if 
the normalized energy: 
¥ 4,(n) 


E(d_)=— (4.1) 


yn) 





in that level is less than a predetermined threshold value. The threshold value is 
determined such that a coefficient set with low energy level does not have a significant 
contribution to the reproduction of the original data. Discarding some coefficient sets by 
this approach allows us to allocate more bits to encode the coefficients at other levels 
with significant energy values. 

While the energy E(d,,) is used as a threshold to select the desired wavelet 
coefficient sets, the variance of the corresponding wavelet coefficients is used in bit 
allocation calculations. Let B represent the desired average number of bits per vector, 
and g,,” be the variance of the wavelet coefficients at resolution level m. Then the optimal 


bit assignment is given by [Ref. 8]: 


1 
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K 
e=(T] o,,)" (4.3) 
m=1 


is the geometric mean of the variances of the wavelet coefficients, and the parameter W,, 
is defined as the ratio of the number of coefficients at resolution level m, L,,, to the 


number of samples in the original signal, L,: 


L 
W,=—". (4.4) 
L, 


The number of coefficients, L,,, at resolution level m is ziven as: 


L |r tl (4.5) 
si 2 
where F is the filter length, and |.] represents the floor operation. 

In practice, this bit allocation method does not guarantee integer bit 
assignments; moreover, it does not exclude the possibility of negative bit allocation. We 


eliminate the negative bit allocation problem by replacing negative values by zero. 


Noninteger bit assignments are adjusted to the nearest integer. 


2. Vector Quantization of Wavelet Coefficients 
The energy distribution of speech data over the wavelet coefficients has 
certain similarities. For a four-level DWT decomposition, the energy is typically 
concentrated in the wavelet coefficients at resolution levels 3 and 4, d, and d,, while the 


wavelet coefficients d, and x, have relatively less energy. Figure 4.3 shows the energy 
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Figure 4.3 Energies of Wavelet Coefficients. 

distribution of wavelet coefficients of a sample speech data. As the energy levels of d, 
and x, are low compared to those of d; and d,, we discard the wavelet coefficients d, and 
x, by replacing them with zeros. (The threshold was set at E(d,,)=0.06.) The quality of 
the reproduced speech indicates that these coefficients (d, and x,) do not have a 
significant contribution in the reproduction. 

Having determined the coefficients to encode, there is another problem to be 
resolved: the small size of training sets at lower resolution levels. Equation 4.5 indicates 
that the total number of samples decreases by approximately a factor of two with each 
resolution level; furthermore, this implies that the size of the training set for that 
particular resolution level shrinks accordingly. One way to compensate for this effect is 


to use a large number of samples in the original data. However, in some cases, such 
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large data sets may not be available. Instead, we use the following technique to extend 
the data sets at low resolution levels. Consider the resolution level m, where we have 
roughly half the number of samples as at resolution level m-/] after carrying out the 


DWT. We now shift the data at level m-] by one sample and repeat the DWT 


decomposition for the shifted data. This gives us a second set of coefficients for 


resolution level m. By combining the two sets of coefficients, we double the number of 
coefficients at resolution level m. The resulting increase in the training set size offsets 
the decrease caused by subsampling. 

The next step is to construct the codebooks by using the training data sets 
obtained by the procedure outlined above. The sizes of codebooks are determined by the 
bit allocation algorithm described in Eq. 4.2. The wavelet coefficients at selected 
resolution levels are then encoded by using the corresponding codebooks. In order to 
reconstruct the original data, we decode the wavelet coefficients by using the same 
codebooks and take the inverse DWT. Here, we substitute zeros for the discarded 


wavelet coefficients. 


C. SIMULATION RESULTS 
In this section we study the performance of the wavelet transform coding (WTC) 
using speech data. Simulations wer carried out by using two data sets: 


@ A training sequence of 104100 samples of ordinary speech from three different 
speakers sampled at 8 kHz. 


@ A test sequence of 20330 samples from a speaker not in the training sequence. 





The test sequence was also encoded by DVQ for comparison. We kept the average bit 
allocation at B=2 bits per sample (bps) (see Eq. 4.2) and changed the vector size to 
obtain different bit rates. The codebook sizes for wavelet coefficients are listed in 
Table 4.1. The distortion criterion defined in Eq. 3.9 for the GL algorithm was set at 
0.01. The performance of the quantization was measured by using a signal-to-noise ratio 


(SNR) defined as: 


2 
SNR=10log,,~% dB (4.6) 





where E(x°) is the signal power and D is the mean-square error between the original and 
reconstructed signal. The signal power is the estimated variance, o,’, of signal samples 
assuming that the input has zero mean. 


Table 4.1 CODEBOOK SIZES FOR WAVELET COEFFICIENTS 









Wavelet Coefficients 


| Wavelet Coefficients 
J oa | a | a | a | 


Codebook 
Size 


The performance of WTC in terms of SNR vs. bit-rate (bits per sample) is 





plotted in Fig. 4.4 as the solid line. The dotted line represents the performance of DVQ. 
Clearly, WTC ts superior to DVQ with a SNR improvement of between 19% and 55%, 
from 0.18 bps to 1.0 bps. Figure 4.5 shows a segment of original test sequence. 
Figure 4.6 shows an encoded example with a bit rate of 0.18 bps and an SNR of 3dB. 


The overall quality is fairly good and the speech is understandable. The same speech 
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Figure 4.4 SNR vs. Bit Rate: Speech Data. The solid line represents WTC while the 
dotted line represents DVQ. 


segment encoded by DVQ is shown in Fig. 4.7. The bit rate was 0.18 bps and SNR was 
2.5dB. Although the SNR values of WTC and DVQ are fairly close, the smoothing 
capability of WTC renders the speech more understandable. Figures 4.8-4.9 show the 
reproduced speech segments encoded at bit rates of 1.03 bps for WTC and 1 bps for 
DVQ. Table 4.2 summarizes the bit rates and the contribution of each wavelet coefficient 


set to the overall distortion. 
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Figure 4.5 Original Test Speech. 
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Figure 4.6 Test Speech Coded by WTC at 0.18 bps, SNR = 3.0 GB. 
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Figure 4.7 Test Speech Coded by DVQ at 0.18 bps, SNR = 2.5 dB. 
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Figure 4.8 Test Speech Coded by WTC at 1.03 bps, SNR = 9.56 dB. 
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Figure 4.9 Test Speech Coded by DVQ at 1.0 bps, SNR = 6.2 dB. 
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Table 4.2 BIT ALLOCATIONS AND MSE CONTRIBUTIONS OF WAVELET 
COEFFICIENTS FOR AN AVERAGE BIT RATE OF 1.03 BPS. 











Wavelet Coefficients 
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Interestingly, given an average number of bits, B, the performance of WTC 






increases rapidly over the performance of DVQ when we use smaller vector dimensions. 
(Recall that we use larger vectors to obtain small bit rates in Fig. 4.4.) This is caused 
by the effect of correlation between samples. At lower levels, the wavelet coefficients 
become less correlated and result in less degradation of quality for small vector 
dimensions. 

We have, additionally, tested the WTC algorithm for coding transient signals. For 
the simulation study, a transient signal of 2400 samples was utilized. Since the energy 
levels of the wavelet coefficients, x, and d,, were low compared to those of d,, d, and 
d;, a three-level DWT decomposition was chosen to analyze the original signal. Among 
the four sets of wavelet coefficients (x, and d,, i=1, 2, 3), x; was discarded due to its 
low energy level (see Table 4.3), and the bit allocation algorithm in Eq. 4.2 was applied 
to the coefficients, d,, d, and d,, keeping the average number of bits at B = 2. Table 4.3 
shows the codebook sizes for the different resolution levels as well as the energy levels 


of the wavelet coefficients at those resolution levels. 


38 








Table 4.3 ENERGY LEVELS AND CODEBOOK SIZES OF WAVELET 
COEFFICIENTS OF TRANSIENT SIGNAL 







Wavelet Coefficients 


0.0690 0.8152 0.1030 0.0140 
Codebook sie] 2 | te | ts 


As in the case of speech waveform coding, the test signal was coded by both WTC 





and DVQ to allow good comparisons. The vector dimensions were changed to achieve 
varying bit-rates in terms of bits per sample. The original signal is shown in Fig. 4.10. 
Figure 4.11 shows the signal coded by WTC at 1.0 bps. The same test signa] coded by 
DVQ at 1.0 bps is displayed in Fig. 4.12. The SNR of the WTC-coded signal was 7.1 
dB while the SNR in case of DVQ was only 3.9 dB. The WTC reproduces the transient 
signal better because it decomposes the original signal into different resolution levels 
corresponding to different frequency bands. Also, the DVQ does not contain a sufficient 
number of code vectors to represent the changing structure of the transient signal well. 
The improvement realized by WTC is confirmed by the SNR performance displayed in 
Fig. 4.13. 

For the two test cases (speech waveform and transient signal coding), the 
improvement in the performance of WTC over DVQ in terms of SNR varied from 19 to 
84 percent. As stated before, this advantage is a result of utilizing different codebooks 
to quantize the wavelet coefficients at different resolution levels and also of discarding 


components that do not contribute to the reproduction of the original signal. 
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Figure 4.10 The Original Transient Signal. 
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Figure 4.11 Transient Signal Coded by WTC at 1.0 bps, SNR = 7.1 dB. 
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Figure 4.12 Transient Signal Coded by DVQ at 1.0 bps, SNR = 3.9 GB. 
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Figure 4.13 SNR vs. Bit Rate: Transient Signal. The Solid Line Represents the 
WTC, and the Dotted Line Represents DVQ. 
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V. ALGORITHM DEVELOPMENT FOR IMAGE DATA AND ANALYSIS 


A. INTRODUCTION 

In this chapter we extend the algorithm developed in the previous chapter to the 
two-dimensional case. The algorithm consists of the same operations illustrated in 
Fig. 4.1. The following sections provide a detailed overview of the algorithm and the 


simulation results. 


B. ALGORITHM DEVELOPMENT 

The analysis step of the proposed coding method includes a three-level DWT 
decomposition in two-dimension as illustrated in Fig. 5.1. A three-level instead of a four- 
level decomposition as used in the speech waveform coding was chosen because the 
correlation between coefficients of an image at lower resolution levels decreases 
significantly, and this makes encoding the wavelet coefficients at those levels extremely 
difficult. Table 5.1 shows the variances of the wavelet and approximation coefficients at 
different resolution levels. Notice the increase in variance of x, at the fourth resolution 
level. One way to compensate the effect of increase in variance would be to use more 
bits in that particular resolution level. But, using more bits increases the computational 
cost involved in generating the codebook and finding the closest codeword for each 


vector; thus, we stop the decomposition at the third resolution level. 
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Figure 5.1 Three-level DWT Decomposition. 


Table 5.1 VARIANCES OF WAVELET COEFFICIENTS (10%) 


Resolution Levels 


116.6700 
4.2170 






As we mentioned in Section D of Chapter I, we would like to have short wavelet 


filters, and we also prefer to use filters with linear phase characteristics. By relaxing the 


orthogonality and choosing biorthogonal basis functions, it 1s possible to use linear phase 
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filters. Throughout the following experiments, we used the 7-tap biorthogonal filters 
listed in Table 5.2 which are short and have linear phase. 


Table 5.2. 7-TAP BIORTHOGONAL FILTER COEFFICIENTS 


-0.053571428571 
-0.010714285714 





1. Optimum Bit Allocation 
The three-level DWT provides 10 sets of wavelet coefficients {x,, d,,, m, i 
= 1, 2, 3} for quantization. Figure 5.2 shows the average energy distribution of wavelet 
coefficients of sample images; the energy of x, is not shown to scale in order to make 
other energy bars visible. (The actual value for x; is shown on top of the corresponding 
bar.) As can be seen, most of the energy is concentrated in x,; the energies contained in 


the other coefficients are much smaller. Using the bit allocation scheme defined in the 





previous chapter leads to a problem because of this uneven energy distribution and the 
high variance of x, (See Table 5.1); all bits are assigned to wavelet coefficients x, leaving 
no bits to represent the other wavelet coefficients. If we discard the other coefficients and 
encode only x,, this causes another problem. The edge information of the image is mostly 
contained in these coefficients, and the edges in the image are lost if we discard all of 
these coefficients. In order to allocate some bits to quantize d,,, we first reserve a large 


number of bits for x, taking the computational cost into account and share the remaining 
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Figure 5.2 Energy Distribution of Wavelet Coefficients Normalized to the Original 
Signal at Resolution Level m=O. 


hits among d’,, by using the bit allocation algorithm of Eq. 4.2. The wavelet coefficients 
d',, and a’, are discarded due to their low energy values. Keeping the average number 
of bits at B=/ bit per vector, we reserve 11 bits for coding of x, and allocate the 
remaining bits to the rest of the wavelet coefficients. Table 5.3 shows the bit allocations 
for all wavelet coefficients. 


Table 5.3 BIT ASSIGNMENTS TO WAVELET COEFFICIENTS 






Wavelet Coefficients 





Codebook Size 
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2. Vector Quantization of Wavelet Coefficients 

Given the number of bits for each set of wavelet coefficients, we construct 
the codebooks by using the GL algorithm. In order to compensate for the decrease in the 
number of coefficients at lower resolution levels, we use the same approach introduced 
in Section C.2 of Chapter IV. Consider the signal at resolution level m which has 
roughly one quarter of the samples of that at resolution level m-1]. First, we decompose 
the level m-J signal into wavelet coefficients. The resulting four wavelet coefficient sets 
at level mm are retained. Now, we shift the level m-] data by one sample horizontally and 
repeat the DWT decomposition. The resulting wavelet coefficients are combined with 
those already obtained at level m to extend the training set. This same procedure is 
repeated using vertical and diagonal shifts. The resulting data set at level m has roughly 
the same size as that at level m-J. 

After constructing the codebooks using the training data obtained as above, 
we encode the wavelet coefficients using the corresponding codebooks. In the 


reconstruction procedure, we substitute zeros for the discarded wavelet coefficients. 


C. SIMULATION RESULTS 
Simulations were carried out by using two images: 


@ A training image of 512-by-512 pixels (Fig. 5.3) constructed from pieces of four 
different images with a resolution of 8 bits per pixel (bpp). 


@ A test image (Lenna) of 512-by-512 pixels shown in Fig. 5.4. 


The test image was not contained in the training data. 
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The coding performance is measured by using the peak signal-to-noise ratio (PSNR) 
defined as the ratio of square of the peak input amplitude to the mean square error, D, 
(Ref. 8}: 
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) aB (5.1) 
D 


PSNR = 10log,,( 





where: 


and N is the number of pixels along one side of the (square) image. 

Figure 5.5 shows the performance of WTC in terms of SNR vs. bit-rate, in bits per 
pixel. We changed the number of pixels in each vector to obtain different bit rates while 
keeping the number of bits for each quantizer fixed as in Table 5.3. The test image was 
also coded by DVQ for comparison. The dotted line in Fig. 5.5 represents the 
performance of DVQ. 

Figure 5.6 shows the WTC-coded test image at 0.25 bpp with an SNR of 27.37 dB 
while Figure 5.7 shows the same image coded by DVQ at 0.25 bpp with an SNR of 20.7 
dB. The performance of WTC is very good for a compression rate of 32:1; however, the 
picture quality is effected by blurred and jagged diagonal edges. This is caused by the 
smoothing effect of the DWT, which produces a positive effect in speech coding. Using 
vectors of small size helps to preserve the edges; Figure 5.8 shows the test image coded 
hy WTC with a vector size of 2 pixels (0.5 bpp) and an SNR of 29.5 dB. The quality 


of the coded image is nearly as good as the original. 
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Figure 5.3 Training Image of 512x512. 





Figure 5.4 Test Image (Lenna) of 512x512. 
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Figure 5.5 SNR vs. Bit Rate: Image Data. The solid line represents WTC while the 
dotted line represents DVQ. 
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Table 5.4 illustrates the contribution of each wavelet coefficient set of Figure 5.8 
to the overall distortion with the bit rates normalized by W,, (see Eq. 4.2 and 4.4). It is 
interesting that the contributions of various wavelet coefficients to the overall distortion 
are by no means equal. It ranges between MSE = 8.24 for d’, and MSE = 172.52 for 
x;. Notice the high distortion in x; despite the high number of bits per pixel used in the 
allocation. This underscores the importance of allocating a large number of bits for the 
codebook of x;. 


Table 5.4 BIT ALLOCATIONS AND MSE CONTRIBUTIONS OF WAVELET 


COEFFICIENTS OF THE IMAGE IN FIG. 5.8 

pa we ee ae 
Tairaeoe| 0 [0 | 0 | oon | on 
Css | as7e [ior | ame [erat | mat | 
He er ae 
Taicnate ope | 0 | oars | oor | ons | areas | 
Pse [a0 | nos | ona | usm | ims 


We also coded the test image by using different vector sizes. Since most of the 











energy was contained in x;, we used a vector dimension of 2 pixels for x; and 8 pixels 
in each vector for the d coefficients. The codebook sizes were kept the same as in Table 
5.3. The resulting bit rate was 0.2 bpp with an SNR of 27.15 dB. The reconstructed 
image is shown in Fig. 5.9. A comparison of Fig. 5.6 and Fig. 5.9 shows that proper 
use of different vector sizes can result in lower bit rates with better visual quality. We 
say "better visual quality” instead of “better SNR" because the quality of the image in 


Fig. 5.9 looks superior to the one in Fig.5.6 although the SNRs are quite close. 
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Figure 5.6 Lenna Coded by WTC at Figure 5.7 Lenna Coded by DVQ at 
0.25 bpp, SNR = 27.37 dB. 0.25 bpp, SNR = 20.7 dB. 





Figure 5.8 Lenna Coded by WTC at Figure 5.9 Lenna Coded by WTC at 
0.5 bpp, SNR = 29.5 dB. 0.2 bpp by Using Different Vector 
Sizes, SNR = 27.42 dB. 
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For different simulations, the improvement in performance of WTC over DVQ in 
terms of SNR ranged from 17 to 50 percent corresponding to bit rates of between 0.16 
and 0.5 bpp. The use of different vector sizes for different wavelet coefficients provided 
better image quality compared to the case of uniform vector size for all wavelet 


coefficients. 
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VI. CONCLUSIONS 


In this thesis, we combined algorithms for discrete wavelet transforms and vector 
quantization to develop coding schemes for speech and image data. We studied wavelets 
based on both orthogonal and biorthogonal bases; we used biorthogonal wavelet functions 
for image coding since filters with linear phase are desired in image coding. 

The algorithm developed for the discrete wavelet transform (DWT) has a pyramidal 
decomposition architecture and is based on the convolution of data in each pyramid level 
with wavelet and scaling filters. Unlike those in traditional Fourier theory, the basis 
functions in the wavelet approach are formed by dilating and translating a single 
prototype function. Given the coefficients of basis filters with infinite precision, the DWT 
reconstructs the original data without any error. 

Vector quantization was shown here to yield better performance when combined 
with the DWT. The results as measured in SNR show that the performance has improved 
by 17% to 84% over direct vector quantization (DVQ) depending on the number of bits 
per sample and the type of data. In speech coding, we found that the subjective quality 
of the coded speech was still superior to that using DVQ, even though the improvement 
in SNR obtained by using the DWT was fairly small. This shows that SNR may not 
always be a good measure of performance. We also tested the same algorithm on 


transient signals. The performance of WTC in this case well exceeded that of DVQ. 
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In image coding, we found that WTC outperforms DVQ for bit-rates around 0.5 
bit per pixel (bpp). At bit-rates lower than 0.5 bpp, WTC suffered from blurred and 
jagged edges although the performance was still better than DVQ. This problem also 
existed for relatively high bit-rates, but it was not as predominant as it was at low bit- 
rates. 

The major coding gains are due to encoding vectors of samples (vector 
quantization) and the use of different codebooks for wavelet coefficients at different 
levels. Although using different codebooks increases the complexity and the 
computational cost, this research has shown thai the VQ-DWT combination improves the 


quality of the reconstructed data over that of DVQ for a given bit rate (bps or bpp). 
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APPENDIX A: PROGRAM DETAILS 


This appendix contains the program listings for each of the MATLAB algorithms 
in the thesis. The codes are categorized into three classifications, |-D DWT, 2-D DWT 


and VQ. 


A. ONE-DIMENSIONAL DWT ROUTINES 


function lowest =dwti(c0,qqq,LOW., file,qq) 


% DWTl One-dimensional! discrete wavelet transform 

% DWTI(...) computes the wavelet coefficients of vector data CO 
% by using a wavelet filter. It stores the wavelet coefficients 

% in a file and returns the lowest resolution level that the data 
% is processed. It also stores the filter banks to avoid 

% rebuilding them in reconstruction process. 

% CO = Input data in vector format 

% QQQ = Wavelet filter type 

% LOW = Lowest desired resolution level 

% FILE = Output file containing the wavelet coefficients 
% QQ = Filter tap 


% Alper Erdemir 
% June 1993 


c0=c0'; 


% Get the filter coeffients 


h=coiflet(qq)’: 

wwavelet =["Coif_*,num2str(qq)]; 
elseif qqq= =2, 

h=daubdata(qq); 

wwavelet =["Daub_*,num2str(qq)]; 
else h=[.5 .5]';wwavelet="HAAR’; 
end 


% Normalize the filter coefficients 


h=sqrt(2)*h: 


% \ntialize some of the constants: 





LL = length(c0); 

Nh = length(h): 

lowest = -ceil(log(LL)/log(2)): 

if abs(lowest) > abs(LOW) lowest =-abs(LOW); end 


% Generate the g coefficients 


g = flipud(h), 

for 1=2:2:Nb 
g(t.l) = -gQ.l); 

end 


WIN =256; % window size for data longer than WIN 


% The following code constructs the filter banks for the scaling 
% filter and the wavelet filter 


Of race usawodindsatadescsucevacs. sduceveemetsecceowce acosoaceeswaesse 
if LL < WIN 
H = fltbnk(h,2*ceil(LL/2)); 
G= fitbnk(g,2*ceil(LL/2)); 
else 
H=fltbnk(h, WIN); 
G=fltbnk(g, WIN); 
end 


% Decomposition routine 


for ivi=-1:-1l:lowest 
lvl 
eval([°c’ ,num2str(abs(Ivl)),"=dcemp1",'(c’.... 
num2str(abs(Ivl)-1),",H,Nh, WIN);'}) 
eval(['d’ num2str(abs(Ivl)),"=dcmp1]’,*(c’,... 
num2str(abs(Iv1)-1),’,G,Nh, WIN);'}) 
end 


% Store the wavelet coefficients 


temp=I]: 
for n= 1:abs(lowest) 
eval([‘temp={temp ‘'d’,num2str(n),” c’,int2str(n),” **]:"}) 
end 
eval(|"save *.file,"_’,wwavelet,’ cO H G WIN wwavelet Nh °.temp]) 


% End of routine 


function x = fltbnk(filt,L) 
% FLTBNK Construction of filter matrices for DWT 
% FLTBNK(FILT,L) constructs the filter matrix using the filter FILT. 
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% This routine is also used for 2-D DWT. 
% FILT = Filter 
% L = Length of data to be processed 


% Alper ERDEMIR 
% Sune 1993 


F = length(filt); 
Le =L+2*F-4; 
Ll =(Le-F)/2 + 1: 


x=zeros(Li,Lc); 
for n=1:LI 


x(n,:)=|zeros(1,2*(n-1)) filt” zeros(1,Lce-F-2*(n-1))|; 


end 


% End of routine 


function x =dcmp 1 (data, F,IF,W) 


routine. 


% DCMP! One-dimensional DWT decomposition 

% DCMP\(...) conducts the DWT decomposition of the input DATA 
% by using the filter bank F with a window size of W. 

% DATA = Input data 

% F = Wavelet/Scaling filter matrix 

% IF = Wavelet/Scaling filter length 

% W = window length for relatively long data 


% Alper ERDEMIR 
% June 1993 


L =length(data); 
LF =IF-2; 


% The following four lines checks if the data vector is odd, if it is 


% true, then a zero ts added to it. 


if rem(L/2,floor(L/2)) ~ = 0 
data =[data;0}; 
L=L+1, 
end 
N= fix(L/W); 
% Add LF zeros to the beginning and end of DATA 


data =[zeros(LF,1):data;zeros(LF,1)]: 
% Decomposition routine 


LLL = length(data); 
LL=(LLL-IF)/2+ 1, 


57 








fN >= 1 


else 


end 


LIL =(W + 2*LF-IF)/2 +1; 
x(1:LIL) = FOR:LIL EW + 2") F)*data(i:W+2*LF): 
LI2 =(W +LF-IF)/2 +1; 
if N>=2 
for n=1:N-1 
x(LH + (¢n-1)*L12 + 1:LE +a0*Li2)=FO:LI2,1:W+LF)... 
*data(LF + n*W + 1:(0 + 1)*W+2*LF); 
end 
end 
REM =rem(L,W): 
if REM ~ = 0 
Lrem =(REM + LF-IF)/2 + 1; 
x(LL-Lrem + 1:LL)=F(1:Lrem,1:REM + LF)*data(LLL-REM-LF + 1:LLL); 
end 
X=X°y 


x= F(t:LL,1:LLL)*data; 


% End of routine 


function ses =idwtl (file) 
% {DWT One-dimensional inverse discrete wavelet transform 


% 


IDWT1(FILE) returns the reconstructed data SES by 
taking the inverse wavelet transform of wavelet coefficients 
stored in FILE DWT. 


% Alper ERDEMIR 
% June 1993 


% Lode the wavelet coefficients 


eval({ ‘load °.file,”_dwt']) 


% Start the reconstruction routine 


eval([{‘ses=c*.num2str(lowest).°;"]) 
FF =Nh-2; 

FFF =floor((FF-1)/2+ 1): 

WIN1 = WIN-2*FF; 

for lvl =-lowest:1:-1 


L=length(ses); 
WIN2 = floor((WIN1 + Nh-1)/2); 
N = fix((L-FFF)/WIN2); 
eval({'a=length(d’ num2str(-lvl-1).°):"]) 
if N== 
Lu=(L-1)*2+Nh; 
eval(["dwork =G(1:L.1:Lu)’**d* »num2str(-Ivl),°:"]) 
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ses H(1:L,1:Lu)’*ses + dwork; 
Lp = lengthises)-Nh+ 2; 

ses =ses(Nh-2 + 1:Lp); 

ses = ses(l:a): 


else 
Lu = (WIN2-1)*2 + Nh; 
eval([‘dwork = G(1:WIN2,1:Lu)''*d".... 
num2str(-Ivl). "C1: WIN2):"]) 
ewk(1:Lu) =: HCL: WIN2,1:Lu)'*ses(1:WIN2) + dwork; 
Lp — length(cwk)-FF: 
cwk ~ cwk(Nh-2 + I: Lp): 
NN = lengthicwk): 
ifN>=2 
Lu2=Lu+ FF; 
WIN3 = WIN? + 2*FFF; 
for n=1:N-1 
eval(|"'dwork = G(1: WIN3,1:Lu2)'"*d’,... 
num 2str(-Ivl).°((n-1)*"VIN2-FFF + WIN2 + I:n* WIN2 + FFF + WIN2):’|) 
cwk2 = H(1:WIN3,1:Lu2)**ses(n*... 
WIN2-FFF + 1:(n + 1)*WIN2 + FFF) + dwork; 
Lp2 =length(cwk2)-FF; 
ewk(NN + (n-1)*(Lp2-FF) + 1:NN +n*(Lp2-FF))=cwk2(FF + |:Lp2); 
end 
NN = lengthicwk): 
end 
REM =L-N*'WIN2; 
if REM ~= 0 
Lrem =(REM-1)*2 + Nh; 
eval(| \dwork = G(1t:(FFF + REM),t:Lrem)***d’,... 
num2str(-Ivl),’(L-REM-FFF + 1:L);"}) 
cwk2 =H(1:(REM + FFF),1:Lrem)’*ses(L-REM-FFF + |:L)+dwork; 
Lp = length(cwk2); 
cwk(NN + I:NN + Lp-FF)=cwk2(FF + 1:Lp); 
end 
ses =cwk(l:a)’; 
end 


end 


% End of routine 
Ode SI ES Th es hn Tel eh, 4 aT teh rn ee he 


B. TWO-DIMENSIONAL DWT ROUTINES 


function lowest =dwt2(c0,flt, LOW. file.qq) 


% DWT2 Two-dimensional discrete wavelet transform 

% DWT2(...) computes the wavelet coefficients of two-dimensioral 

% data CO by using a wavelet filter FLT tn both horizantal and vertical 
% directions. It stores the wavelet coefficients in a FILE DWT and 

% returns the lowest resolution level that the data ts processed. 

%e Both orthogonal and biorthogonal decompostion can be done 

% by using appropriate filters. 

% CO - Input data in vector format 
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%e FLT = Wavelet filter type 

% LOW = Lowest desired resolution level 

% FILE = Output file containing the wavelet coefficients 
% QQ — Filter tap 


% Alper Erdemir 
% June 1993 


% Construct 2-D filter banks for decomposition 





|HH.HV.GH,GV.Nh.Nv]= filts2(cO,flt,qq. 1): 


% Determine of the lowest possible resolution level 


[Li L2]=size(cO): 
rsl = min(ceil(log(L 1 )/log(2)).ceil(log(L2)/log(2))); 
lowest = abs(rsl); 
if lowest > LOW 
lowest= LOW: 
end 


% Decomposition routine 


for Ivl= 1: lowest 

eval([‘c’ num2str(lvl),°=demp2(c’,... 
num2str(Ivi-1),’,HH,HV,Nbh,Nv);"]) 
eval({‘dl* num2str(Ivl),"=demp2(c’,... 
oum2str(Ivi-1),’°,HH,.GV.Nh,Nv);"}) 
eval({°d2° num2str(Ivl),° =dcemp2(c’,... 
num2str(Ivi-1),",GH,HV,Nh,Nv);°]) 
eval({'d3° ,num2str(ivl),"=demp2(c*,... 
num2str(IvI-1),°,GH,GV,Nh.Nv):"]) 





end 


% Store the wavelet coefficients 


Off nw toes hn aitactuah oxen oh sace se tony ecesndaS ea tebe a eee 
temp =|]: 
for n= I:lowest 
for m=1:2 
eval(|'‘temp=(temp ‘'d*.int2s.7(m),int2str(n),” **}:"]) 
end 
eval(|‘temp={|temp “"c’.int2str(n),” “"|."]) 
end 


eval(|'save ‘file,’ dwt flt qq °.temp]) 


% End of routine 
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function [HH.HV.GH,GV.Nh.Nv] = filts2(cO,flt.qq.dir) 
% FILTS2 construction of 2-D filter banks 


% FILTS2 returns the 2-D filter matrices using horizantal and 

% vertical wavelet and scaling filters. It also returns the filter 

% lengths. The available filter types are Daubechies |- to 10-tap 

% filters, Haar filter, and biorthogonal |- to 5-tap filters. 

% CO = Input data in matrix format 

% FLT = filter type (for both directions) 

% QQ = Filter tap 

% DIR = Indicator for decomposition and reconstruction procedure 
% (used for biorthogonal filters) 


% Alper ERDEMIR 
% June 1993 


% Get the filter coefficient 


Yo w nnn nnn nn ne ene nnn nn nnn nn nn nn nn nn nn nn nnn nn nn nn nnn nnn en ene ee nnn rene 
if flt == 
Hh = daubdata(qq): 
elseif flt == 2 
Hh=[.5 .S}°: 
elseif flt == 3 
{ Hh, Hh2] = biort(qq): 
Hh =[0;HB}: 
Hh2=[0:Hh2]; 
Hv2 = Hh?2; 
end 
Hv = Hh: % Let vertical filter equal to horizanta! one 


Nh = length( Hh); 
Nv =length(Hv): 


% Generate the g coefficients 


Of, s Seicacecdeiehectece Seelesenetasetatestieso docs S is sabe eett ee 
if flr <= 2 
Gh = flipud( Hh); 
for 1=2:2:Nh 
Gh(i.1)=-Gh(i.1); 
end 
Gv = flipud(Hv); 
for 1=2:2:Nv 
Gv.) =-Gvi.t): 
end 
else 
if dir == t 
Gh = flipud(Hh2), 
for «= 2:2:Nh 
Gh(i.l) = -Ghi,!): 
end 
Gv = flipud(Hv2): 
for 1=2:2:Nv 
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Gva.l) = -Gvai,1); 
end 
else 
Gh = flipud(Hh); 
for 1=2:2:Nb 
Ghii,l) = -Ghai,1); 
end 
Gv = flipud(Hv); 
for 1=2:2:Nv 
Gvti.d) = -Gv(i,}): 
end 
Hh = Hh2;Hv =Hv2: 
end 
end 


|drow0,dcol0| = size(cO): 
% Construct the filter banks 


HH = sqrt(2)*fltbnk( Hh, 2*ceil(dcol0/2)); 
HV =sqrt(2)*fltbnk(Hv ,2*ceil(drow0/2)); 
GH = sqrt(2)*fltbnk(Gh,2*ceil(dcol0/2)); 
GV =sqrt(2)*fltbnk(Gv ,2*ceil(drow0/2)); 


% End of routine 


function x =dcmp2(data, Fh, Fv.!Fb,IFv) 

% DCMP2 Two-dimensional DWT decomposition Routine 

% DCMP2(...) conducts the DWT decomposition of the input DATA 
% by using horizantal and vertical filter banks, FH and FV, 


% DATA = Input data 

% FH = Horizantal wavelet/scaling filter matrix 
% FV = Vertical wavelet/scaling filter matrix 
% IF = Horizantal wavelet/scaling filter length 
% IF = Vertical wavelet/scaling filter length 


% Alper ERDEMIR 
% June 1993 


{Lv Lh| =size(data): 
% \f the horizantal length of the data is odd, add a column of zeros 
if rem(Lh/2,floor(Lh/2)) ~ = 0 

data ={data zeros(Lv,1)]: 

Lh=Lh+1; 


end 


% Add IFh-2 columns to left and rigth of the data 
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data =|zeros(Lv,IFh-2) data zeros(Lv,IFh-2)]; 


vlen = Lh +2*(IFh-2); % Width of the Hor. Filt. Matrix 
hlen=(vlen-IFh)/2+1; % Heigth of the Hor. Filt. Matrix 


% Horizantal decomposition routine 
xh = (Fh(1:hlen, 1: vlen)*data’)’; 
% \f the horizantal length of the data is odd, add a column of zeros 


if rem(Lv/2,floor(Lv/2)) ~ = 0 
xh =[xh:zeros(1.hlen)]: 
Lv=Lv+1; 

end 


% Add \Fh-2 columns to left and rigth of the data 
xh =[zeros(IFv-2. hlen);xh;zeros(iFv-2,hlen)]; 


vien2 = Lv +2*(IFv-2); % Width of the Vert. Filt. Matrix 
hlen2 = (vlen2-IFv)/2 + 1; % Heigth of the Vert. Filt. Matrix 


% Vertical decomposition routine 
x =Fv(1:hlen2,1:vien2)*xh; 


% End of routine 


% function tm = idwt2(file) 

% IDWT2 Two-dimensional inverse discrete wavelet transform 
%e IDWT2(FILE) returns the reconstructed image IM by 

% taking the inverse wavelet transform of wavelet coefficients 
% stored in FILE DWT. 


% Alper ERDEMIR 
% June 1993 


% Load the wavelet coefficients and construct the filter matrices 


%eval({cO=load2(*’’,file,”’*):°]) 
Zeval(|load *.file.” dwt']) 
if ~exist(‘'HH’*) 
(HH,HV.GH,GV,Nh.Nv|= filts2(cO.flt,qq.2); 
end 


% Reconstruction routine 
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eval({‘im =c*,num2stri(lowest),";*}) 
for lvl=lowest:-t:1 
ewrk =retver2(im,HV.Nv); 
eval(| dwrk! =rctver2(d1*,num2str(lvl),°.GV.Nv):'‘}) 
datvl =cwrk +dwrkl; 
eval({’dwrk2 =rctver2(d2° .num2str(tvl).°,HV.Nv);"|) 
eval({‘dwrk3 = rctver2(d3’ ,sum2str(ivi),",GV,Nv);°)) 
datv2 = dwrk2 + dwrk3; 
dath! =rcthor2(datv! ,HH,Nh); 
dath2 = rcthor2(datv2.GH,Nh); 
im =dath! + dath2; 
{Lv Lh] =size(im); 
im =im(Nv-2 + 1:Lv,Nh-2 + 1:Lh); 
if lvi~ =1 
eval(|"fal a2]=size(d!* num2str(Ivl-1).°);"]) 
else [al a2|=size(cO); 
end 
im=im(l:al,1:a2); 
end 


% End of routine 


function x =rcthor2(data, Fh,1Fh) 
% RCTHOR2 conducts the 2-D horizantal reconstruction of 


% wavelet and approximation coefficients. 

% DATA = Input data 

% FH = Horizantal wavelet/scaling filter matnx 
%e LFH = Horizantal wavelet/scaling filter length 


% Alper ERDEMIR 
% June 1993 


[Lv Lh} =size(data); 
vien =(Lh-1)*2+1Fh; 


% do the honzantal decomposition 
x =(Fh(1:Lh.1:vlen)’*data’)’; 


% End of routine 


funct:on x=rctver2(data,Fv.IFv) 
% RCTVER2 conducts the 2-D vertical reconstruction of 


% detail and approximation coefficients. 
% DATA = Input data 
% FV = Vertical wavelet/scaling filter matrix 


% LFH = Vertical wavelet/scaling filter length 


% Alper ERDEMIR 
% June 1993 


[Lv Lh] = size(data): 
vien =(Lv-1)*2 +1Fv; 


% do the vertical decomposition 
x> Fv(i:Lv.t:vien)'*data: 
% End of routine 


C. VQ ROUTINES 


function w =initcb(x,B) 


% \NITCB Initialization of the codebook for VQ 
% INITCB(X,B) initializes the codebook randomly selecting data 


% vectors from the subject data set X. 
% X = Data set 
% B = Total number of bits 


% Alper Erdemir 
% June 1993 


N=2*B; % Find the codebook size 
% Randomly select the codewords from X 
rand(‘uniform’) 
Nx = max(size(x)); 
for 1=1:N 
w(:.4) = x(:.ceil(Nx*rand(1))); 


end 


% End of routine 


function x =dat2vec(y.M) 


% DAT2VEC Conversion of data format to be used in VQ 
% DAT2VEC(Y,M) converts the input data into the VQ format 


% Y = Input data of Nxl (for 1-D) or NxN (for 2-D) where N is 
% data length 
% M = vector/block size 


% Alper ERDEMIR 
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% Sune 1993 


N = size(y); 
al=M(1): 


% if input data is |-D 


if NG) == 1) N(Q2) == 1 
in_hei=length(y); 
ola=ceil(in_hei/nl); 
y=yl); 
x=[y:zeros(nI *nla-in_hei.!)]; 
x =reshape(x.n1.nla): 


% else, input is 2-D 


Opt test Santen neh ites th Ae LP ae eens awe as 2 
else 
n2=M(2); 
nla=ceil(N(1)/nl); 
n2a=ceil(N(2)/n2); 
x=zeros(nl *n2,nla*n2a); 
yy =zeros(nla*ni ,n2a*n2); 
yy(i:N(1),1:N(2))=y; 
for i=I:nla 
il =(i-1)*n1 +1, 
for j=1:n2a 
k=(i-1)*n2a+j; 
} jg) =G-1)*n2 +4; 
z=yy(ilul+ol-1ljlyl+n2-1); 
x(:,k) =2(:); 
end 
end 
end 


% End of routine 


function x = vec2dat(y,N,BLK) 

% VEC2DAT conversion of the data from VQ format to original format 

% VEC2DAT(Y.N.BLK) converts the data Y in VQ format to original 
% format of size NxI (for !-D) or size NxN (for 2-D). 


% Y = Subject data in vector format 
% N= Size of original data 
% BLK = Vector/block size 


% Alper Erdemir 
% June 1993 


% if data is 1-D 
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if nargin = = 
y=y): 
x=y(1:N); 


% else itis 2-D 


% A ds a Soa pnd we Sea oh nas ub aa swat a ogc cain sa eles a da aia nd es cad ca a acs 
else 
nla=ceil(N(1)/BLK(1)); 
n2a=ceil(N(2)/BLK(2)); 
x = zeros(BLK(1)*N(1),BLK(2)*N(2)): 
for i= I1:nla 
il =(1-1)*BLK(1)+1, 
for j=I1:n2a 
k=(1-1)*n2a+)j; 
yl =Q-lD*BLK(2) +1; 
z=zeros(BLK(1),BLK(2)); 
for |= 1:BLK(2) 
m =(1-1)*BLK(1) +1; 
2(:,) =y(m:m + BLK(1)-1.k); 
end 
x(il:ail +BLK(1)-1 jl jl +BLK(2)-1) =z; 
end 
end 
x=x(b:N(1),1E:N(2)); 
end 


% End of routine 


function [w,m]|=Ibg(tr,bit, BLK) 
% LBG conducts the generalized Lloyd's algorithm 
% LBG(TR,BIT.BLK) returns the trained codebook for VQ. The code first 


%e initializes the codebook, then trains it by using Lloyd's iteration. 
%e The training stops when the criteria equals to 0.01. 

% TR = Training data 

%o BIT = Total number of bits to costruct the codebook 

% BLK = Vector/block size 


% Alper Erdemir 
% June 1993 


x =dat2vec(tr.BLK): 

% Initialize the codebook 
w = initch(x. bit): 

m=(]: 

N=0; 


% Do the Lloyd's iteration until convergence 











w = lHoyd(x,w); 
m= mse(x,w,m); 
save inf m w 
N=N-e+1; 
end 
while | 
w= lloyd(x,w); 
m = mse(x,w,m): 
save inf m w 
N=N+1; 
if abs((m(N-1)-m(N))/m(N-1)) <= 0.01 
break: 
end 
end 


% End of routine 


function wup =lloyd(x,w) 

% LLOYD Lloyd's iteration for VQ 

% LLOYD(X,W) performs the Lloyd iteration for codebook improvement 
% to the given codebook W by using the training data set X. 

% X = Training data 

% W = Previous codebook 


% Alper ERDEMIR 
% June 1993 


L=size(x); 

N =size(w); 

wup = zeros(N(1),N(2)); 
u=zeros(N(2),1); 

y =ones(1,N(2)); 


% Partition the training set into clusters by using NNC 


Opec aired Bete eet thes es ee 

for k=1:L(2) 
d=sum((x(:,k)*y-w).*2); 
[na.iw] = min(d); 
wup(: iw) = wup(:,iw) + x(:,k); 
u(iw) = u(iw) + 1; 

end 


% Take care of the empty clusters ( < =3 training vectors) 
% by splitting the centroid of the clusters with 
% the highest numbers of training vectors 


i=find(u= =O); 
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u(i) = ones(size(1)): 

wup = wup./(u*ones(1,L(1)))"; 

1= find(u <4); 

LY l}=sort(u); 

I =flipud(1); 

for n= | :length(i) 
eps =norm(wup(:,I(n)))*.001; 
wup(:,i(n)) = wup(:,I(n)) + eps; 
wup(:,1(n)) = wup(:.I(n))-eps; 
u(i(n)) =u(E(n)): 

end 


% End of routine 


function m= mse(x,w,m) 


% MSE Computation of mean-square error of a codebook 
%e MSE(X,W,M) computes the mean-square error of the given codebook 


% W, and returns it by appending the current value to an existing 
% vector M, which bas the MSE record for each iteration of 

% the codebook 

% X = Training data 

% W = Current codebook 


% M 


Il 


Vector containing the MSE record 


% Alper Erdemir 
% June 1993 


N =size(w); 

Nx =max(size(x)): 
mse! =0; 

y =ones(1,N(2)): 


for k=1:Nx 
d=sum((x(:,k)*y-w).°2): 
mse { = mse { + min(d); 

end 

mse! = msel/(Nx*N(1)); 

m=([m. mse |; 


% End of routine 


function [z,mse] = code(x,w) 

% CODE coding of data set X using codebook W 

% CODE(X,W) returns the encoded data, Z, and the mean-square error 
%o of coding. 
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% X = Data set to be coded 
% W = Codebook 


% Alper ERDEMIR 
% June 1993 


N =size(w); 

Nx = max(size(x)); 
mse =0; 

y =ones(1 ,N(2)); 
z= zeros(N(1),Nx); 


% Find the closest code vector to each input vector 
for k=1:Nx 
d=sum((x(:,k)*y-w),*2); 
[md,iw] = min(d); 
2(:,k) = Ww ,iw), 
mse = mse + md: 
end 
% Find the mean-square error 


mse = mse/(Nx*N(1)); 


% End of routine 


70 





10. 





LIST OF REFERENCES 


Olivier Rioul and Martin Vetterli, "Wavelets and Signal Processing,“ JEEE Signal 
Processing Magazine, pp. 14-28, October 1991. 


Ingrid Daubechies, Ten Leciures on Wavelets, Society of Industrial and Applied 
Mathematics, Philadelphia, 1992. 


Stephane G. Mallat, "A Theory for Multiresolution Signal Decomposition: The 
Wavelet Representation," JEEE Transactions on Pattern Analysis and Machine 
Intellegence, Vol. 11, No. 7, pp. 674-693, 1989. 


Martin Vetterli and Cormac Herley, "Wavelets and Filter Banks: Theory and 
Design," JEEE Transactions on Signal Processing, Vol. 40, No. 9, 1992. 


Ali N. Akansu and Richard A. Haddad, Multiresolution Signal Decomposition, 
Academic Press, San Diego, 1992. 


Robert M. Gray, "Vector Quantization," IEEE ASSP Magazine, Vol.1, pp.4-29, 
April 1984. 


John Makhoul and others, "Vector Quantization in Speech Coding," Proceedings 
of the IEEE, Vol. 73, No.11, pp. 1551-1558, November 1985. 


Allen Gersho and Robert M. Gray, Vector Quantization and Signal Compression, 
Kluwer Academic Publishers, Boston, 1992. 


Y. Linde and Others, "An Algorithm for Vector Quantizer Design," IEEE 
Transactions on Communications, Vol. COM-28, pp. 84-95, January 1980. 


M. Antonini, and others, "Image Coding Using Vector Quantization in the Wavelet 
Transform Domain," JEEE International Conference on Acoustics, Speech, and 
Signal Processing, Albuquerque, pp. 2297-2300, April 1990. 


M. Antonini, and Others, "Image Coding Using Lattice Vector Quantization of 
Wavelet Coefficients," JEEE International Conference on Acoustics, Speech, and 
Signal Processing, Toronto, pp. 2273-2276, May 1991. 








15. 


Ronald A. DeVore, Bjorn Jawerth, and Bradley J. Lucier, "Image Compression 
Through Wavelet Transform Coding," JEEE Transactions on Information Theory, 
Vol. 38, No. 2, pp. 719-746, March 1992. 


Olivier Rioul and Pierre Duhamel, “Fast Algorithms for Discrete and Continuous 
Wavelet Transforms," JEEE Transactions on Information Theory, Vol. 38, No. 2, 
pp. 569-586, March 1992. 


Ingrid Daubechies, “Orthonormal Bases of Compactly Supported Wavelets," 
Communications on Pure and Applied Mathematics, Vol. XLI, pp. 909-996, 1988. 


C. E. Shannon, "Coding Theorems for a Discrete Source with a Fidelity 
Criterion," JRE National Convention Record, Part 4, pp.142-163, 1959. 


72 






















Defense Technical Information Center 
Cameron Station 
Alexandria VA 22304-6145 


Library, Code 052 
Naval Postgraduate School 
Monterey CA 93943-5002 


Chairman, Code EC 

Department of Electrical and Computer Engineering 
Naval Postgraduate School 

Monterey. CA 93943-5000 


Professor Murali Tummala, Code EC/Tu 
Department of Electrical and Computer Engineering 
Naval Postgraduate School 

Monterey, CA 93943-5000 


Professor Charles W. Therrien, Code EC/Ti 
Department of Electrical and Computer Engineering 
Naval Postgraduate School 

Monterey, CA 93943-5000 


Samuel J. Frazier, Code SY84 
Aircraft Division 

Naval Air Warfare Center 
Patuxent, MD 20670-5000 


Deniz Kuvvetleri Komutanligt 
Personel Daire Baskanligi 
Bakanliklar, Ankara/TURKEY 


Golcuk Tersanesi Komutanligi 
Golcuk, Kocaeli/ TURKEY 


INITIAL DISTRIBUTION LIST 







No. Copies 


to 





Deniz Har Okulu Komutanligi 
81704 Tuzla, Istanbul/TURKEY 


Taskizak Tersanesi Komutanligi 
Kasimpasa, Istanbul/TURKEY 


LTJG Alper Erdemir 
Gunesli Sok. No: 4 Daire: 7 
81090 Erenkoy, Istanbul/TURKEY 





74 





